### Aarch64 most common instructions

General conventions
rd, rn, rm: w or x registers; op2: register or #immn (n-bit immediate)

Containers: x (64-bit register), w (32-bit register)

| Group | Instruction | Mnemonic | Syntax | Explanation | Flags |
|-------|-------------|----------|--------|-------------|-------|
| Arithmetic operations | Addition | ADD{S} | ADD{S} rd, rn, op2 | rd = rn + op2 | {Yes} |
| | Subtraction | SUB{S} | SUB{S} rd, rn, op2 | rd = rn - op2 | {Yes} |
| | Negation | NEG{S} | NEG{S} rd, op2 | rd = -op2 | {Yes} |
| | with carry | NGC{S} | NGC{S} rd, rm | rd = -rm - ~C | {Yes} |
| | Multiply | MUL | MUL rd, rn, rm | rd = rn x rm | |
| | Unsigned multiply long | UMULL | UMULL xd, wn, wm | xd = wn x wm | |
| | Unsigned multiply high | UMULH | UMULH xd, xn, xm | xd = <127:64> of xn x xm | |
| | Signed multiply long | SMULL | SMULL xd, wn, wm | xd = wm x wn (signed operands) | |
| | Signed multiply high | SMULH | SMULH xd, xn, xm | xd = <127:64> of xn x xm (signed operands) | |
| | Multiply and add | MADD | MADD rd, rn, rm, ra | rd = ra + (rn x rm) | |
| | Multiply and sub | MSUB | MSUB rd, rn, rm, ra | rd = ra - (rn x rm) | |
| | Multiply and neg | MNEG | MNEG rd, rn, rm | rd = -(rn x rm) | |
| | Unsigned multiply and add long | UMADDL | UMADDL xd, wn, wm, xa | xd = xa + (wm x wn) | |
| | Unsigned multiply and sub long | UMSUBL | UMSUBL xd, wn, wm, xa | xd = xa - (wm x wn) | |
| | Unsigned multiply and neg long | UMNEGL | UMNEGL xd, wn, wm | xd = -(wm x wn) | |
| | Signed multiply and add long | SMADDL | SMADDL xd, wn, wm, xa | xd = xa + (wm x wn) | |
| | Signed multiply and sub long | SMSUBL | SMSUBL xd, wn, wm, xa | xd = xa - (wm x wn) | |
| | Signed multiply and neg long | SMNEGL | SMNEGL xd, wn, wm | xd = -(wm x wn) | |
| | Unsigned divide | UDIV | UDIV rd, rn, rm | rd = rn / rm | |
| | Signed divide | SDIV | SDIV rd, rn, rm | rd = rn / rm | |
| | Note: the remainder may be computed using the MSUB instruction as numerator - (quotient x denominator) | | | | |
| Bitwise logical operations | Bitwise AND | AND{S} | AND{S} rd, rn, op2 | rd = rn & op2 | {Yes} |
| | Bitwise AND with neg | BIC{S} | BIC{S} rd, rn, op2 | rd = rn & ~op2 | {Yes} |
| | Bitwise OR | ORR | ORR rd, rn, op2 | rd = rn \| op2 | |
| | Bitwise OR with neg | ORN | ORN rd, rn, op2 | rd = rn \| ~op2 | |
| | Bitwise XOR | EOR | EOR rd, rn, op2 | rd = rn ⊕ op2 | |
| | Bitwise XOR with neg | EON | EON rd, rn, op2 | rd = rn ⊕ ~op2 | |
| | Logical shift left | LSL | LSL rd, rn, op2 | Logical shift left (zeros enter from the right) | |
| | Logical shift right | LSR | LSR rd, rn, op2 | Logical shift right (zeros enter from the left) | |
| | Arithmetic shift right | ASR | ASR rd, rn, op2 | Arithmetic shift right (preserves sign) | |
| | Rotate right | ROR | ROR rd, rn, op2 | Rotate right (carry not involved) | |
| | Move to register | MOV | MOV rd, op2 | rd = op2 | |
| | Move to register, neg | MVN | MVN rd, op2 | rd = ~op2 | |
| | Test bits | TST | TST rn, op2 | rn & op2 | Yes |
| Bit field ops | Bitfield insert | BFI | BFI rd, rn, #lsb, #width | Copies a bitfield of #width bits starting at source bit 0 to destination starting at bit #lsb; other rd bits are unchanged | |
| | Bitfield extract | UBFX | UBFX rd, rn, #lsb, #width | Copies a bitfield of #width bits starting at source bit #lsb to destination starting at bit 0; clears all other rd bits | |
| | Signed bitfield extract | SBFX | SBFX rd, rn, #lsb, #width | Copies a bitfield of #width bits starting at source bit #lsb to destination starting at bit 0; sign extends the result | |
| Bit/byte ops | Count leading sign | CLS | CLS rd, rm | Count leading sign bits | |
| | Count leading zero | CLZ | CLZ rd, rm | Count leading zero bits | |
| | Reverse bit | RBIT | RBIT rd, rm | Reverse bit order | |
| | Reverse byte | REV | REV rd, rm | Reverse byte order | |
| | Reverse byte in half word | REV16 | REV16 rd, rm | Reverse byte order on each half word | |
| | Reverse byte in word | REV32 | REV32 xd, xm | Reverse byte order on each word | |
| Load and Store operations | Store single register | STR | STR rt, [addr] | Mem[addr] = rt | |
| | Sub-type byte | STRB | STRB wt, [addr] | Byte[addr] = wt<7:0> | |
| | Sub-type half word | STRH | STRH wt, [addr] | HalfWord[addr] = wt<15:0> | |
| | Store register pair | STP | STP rt, rm, [addr] | Stores rt and rm in consecutive positions starting at addr | |
| | Load single register | LDR | LDR rt, [addr] | rt = Mem[addr] | |
| | Sub-type byte | LDRB | LDRB wt, [addr] | wt = Byte[addr] (only 32-bit containers) | |
| | Sub-type signed byte | LDRSB | LDRSB rt, [addr] | rt = Sbyte[addr] (signed byte) | |
| | Sub-type half word | LDRH | LDRH wt, [addr] | wt = HalfWord[addr] (only 32-bit containers) | |
| | Sub-type signed half word | LDRSH | LDRSH rt, [addr] | rt = SHalfWord[addr] (signed half word) | |
| | Sub-type signed word | LDRSW | LDRSW xt, [addr] | xt = Sword[addr] (signed word, only for 64-bit containers) | |
| | Load register pair | LDP | LDP rt, rm, [addr] | Loads rt and rm from consecutive positions starting at addr | |
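
Two entries in the table are easier to follow with concrete values: the remainder computed as numerator - (quotient x denominator), and the bitfield instructions. The sketch below models them in portable C rather than assembly so it can run anywhere. The function names (`udiv64`, `msub64`, `urem64`, `low_mask`, `ubfx64`, `sbfx64`, `bfi64`) are illustrative and not part of any toolchain, and the signed result assumes the usual two's-complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/* UDIV xd, xn, xm: unsigned quotient. AArch64 yields 0 for a zero divisor
   instead of trapping, which this model reproduces. */
uint64_t udiv64(uint64_t n, uint64_t m) { return m == 0 ? 0 : n / m; }

/* MSUB xd, xn, xm, xa: xd = xa - (xn x xm), wrapping modulo 2^64. */
uint64_t msub64(uint64_t n, uint64_t m, uint64_t a) { return a - n * m; }

/* Remainder idiom:  UDIV x2, x0, x1  followed by  MSUB x3, x2, x1, x0 */
uint64_t urem64(uint64_t num, uint64_t den) {
    uint64_t quot = udiv64(num, den);
    return msub64(quot, den, num);
}

/* Mask with the low `width` bits set (1 <= width <= 64). */
uint64_t low_mask(unsigned width) {
    assert(width >= 1 && width <= 64);
    return width == 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
}

/* UBFX xd, xn, #lsb, #width: field moved to bit 0, upper bits cleared. */
uint64_t ubfx64(uint64_t n, unsigned lsb, unsigned width) {
    assert(lsb + width <= 64);
    return (n >> lsb) & low_mask(width);
}

/* SBFX xd, xn, #lsb, #width: same field, sign-extended from its top bit. */
int64_t sbfx64(uint64_t n, unsigned lsb, unsigned width) {
    uint64_t field = ubfx64(n, lsb, width);
    uint64_t sign = UINT64_C(1) << (width - 1);
    return (int64_t)((field ^ sign) - sign);
}

/* BFI xd, xn, #lsb, #width: low bits of n inserted at lsb, rest of d kept. */
uint64_t bfi64(uint64_t d, uint64_t n, unsigned lsb, unsigned width) {
    assert(lsb + width <= 64);
    uint64_t mask = low_mask(width);
    return (d & ~(mask << lsb)) | ((n & mask) << lsb);
}
```

For example, `urem64(17, 5)` gives 2, and extracting 8 bits at bit 4 of `0xABCD` gives `0xBC` unsigned or -68 signed.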

| Group | Instruction | Mnemonic | Syntax | Explanation | Flags |
|-------|-------------|----------|--------|-------------|-------|
| Branch ops | Branch | B | B target | Jump to target | |
| | Conditional branch | B.cc | B.cc target | If (cc) jump to target | |
| | Compare and branch if zero | CBZ | CBZ rd, target | If (rd = 0) jump to target | |
| | Compare and branch if not zero | CBNZ | CBNZ rd, target | If (rd ≠ 0) jump to target | |
| Conditional ops | Conditional select | CSEL | CSEL rd, rn, rm, cc | If (cc) rd = rn else rd = rm | |
| | with increment | CSINC | CSINC rd, rn, rm, cc | If (cc) rd = rn else rd = rm+1 | |
| | with negate | CSNEG | CSNEG rd, rn, rm, cc | If (cc) rd = rn else rd = -rm | |
| | with invert | CSINV | CSINV rd, rn, rm, cc | If (cc) rd = rn else rd = ~rm | |
| | Conditional set | CSET | CSET rd, cc | If (cc) rd = 1 else rd = 0 | |
| | with mask | CSETM | CSETM rd, cc | If (cc) rd = -1 else rd = 0 | |
| | with increment | CINC | CINC rd, rn, cc | If (cc) rd = rn+1 else rd = rn | |
| | with negate | CNEG | CNEG rd, rn, cc | If (cc) rd = -rn else rd = rn | |
| | with invert | CINV | CINV rd, rn, cc | If (cc) rd = ~rn else rd = rn | |
| Compare ops | Compare | CMP | CMP rd, op2 | rd - op2 | Yes |
| | with negative | CMN | CMN rd, op2 | rd - (-op2) | Yes |
| | Conditional compare | CCMP | CCMP rd, rn, #imm4, cc | If (cc) NZCV = CMP(rd,rn) else NZCV = #imm4 | Yes |
| | with negative | CCMN | CCMN rd, rn, #imm4, cc | If (cc) NZCV = CMP(rd,-rn) else NZCV = #imm4 | Yes |
| | Note: for these instructions rn can also be an #imm5 (5-bit unsigned immediate value 0..31) | | | | |
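
The conditional select family replaces short branches with straight-line code. The following is a minimal C model of a few of these instructions plus two common idioms (maximum and absolute value). The function names are illustrative, and the `bool cc` argument stands in for a condition evaluated on the flags of a preceding CMP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CSEL rd, rn, rm, cc   and   CSINC rd, rn, rm, cc */
uint64_t csel(bool cc, uint64_t n, uint64_t m)  { return cc ? n : m; }
uint64_t csinc(bool cc, uint64_t n, uint64_t m) { return cc ? n : m + 1; }

/* CSET rd, cc is an alias of CSINC rd, zr, zr, <inverted cc>. */
uint64_t cset(bool cc) { return csinc(!cc, 0, 0); }

/* CNEG rd, rn, cc: negate when the condition holds. */
uint64_t cneg(bool cc, uint64_t n) { return cc ? 0 - n : n; }

/* Signed maximum:  CMP x0, x1  then  CSEL x0, x0, x1, GT */
int64_t max_s64(int64_t a, int64_t b) {
    bool gt = a > b;                      /* what GT reports after CMP a, b */
    return (int64_t)csel(gt, (uint64_t)a, (uint64_t)b);
}

/* Absolute value:  CMP x0, #0  then  CNEG x0, x0, LT */
int64_t abs_s64(int64_t a) {
    bool lt = a < 0;                      /* what LT reports after CMP a, #0 */
    return (int64_t)cneg(lt, (uint64_t)a);
}

/* Spot checks of the two idioms. */
void check_select_idioms(void) {
    assert(max_s64(3, 7) == 7);
    assert(abs_s64(-5) == 5);
}
```

Negation is done on unsigned values so the model wraps like the hardware does instead of relying on signed overflow.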

# Aarch64 accessory information

| Condition codes (magnitude of operands) | | |
|----|-------------------------------|------------------|
| LO | Lower, unsigned               | C = 0            |
| HI | Higher, unsigned              | C = 1 and Z = 0  |
| LS | Lower or same, unsigned       | C = 0 or Z = 1   |
| HS | Higher or same, unsigned      | C = 1            |
| LT | Less than, signed             | N != V           |
| GT | Greater than, signed          | Z = 0 and N = V  |
| LE | Less than or equal, signed    | Z = 1 or N != V  |
| GE | Greater than or equal, signed | N = V            |

| Condition codes (direct flags) | | |
|----|------------------|-------|
| EQ | Equal            | Z = 1 |
| NE | Not equal        | Z = 0 |
| MI | Negative         | N = 1 |
| PL | Positive or zero | N = 0 |
| VS | Overflow         | V = 1 |
| VC | No overflow      | V = 0 |
| CS | Carry            | C = 1 |
| CC | No carry         | C = 0 |
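
Each condition code is a predicate over the N, Z, C and V flags. The sketch below writes those predicates out in C. The `struct nzcv` type and the `cond_holds` function are hypothetical names introduced for illustration. HS and LO are treated as synonyms of CS (C = 1) and CC (C = 0).

```c
#include <assert.h>
#include <stdbool.h>

/* The four condition flags. */
struct nzcv { bool n, z, c, v; };

/* HS and LO are alternative spellings of CS and CC. */
enum cond { EQ, NE, CS, HS = CS, CC, LO = CC, MI, PL, VS, VC,
            HI, LS, GE, LT, GT, LE };

/* Returns true when condition `cc` holds for the flags `f`. */
bool cond_holds(enum cond cc, struct nzcv f) {
    switch (cc) {
    case EQ: return f.z;
    case NE: return !f.z;
    case CS: return f.c;                 /* carry set / higher or same */
    case CC: return !f.c;                /* carry clear / lower */
    case MI: return f.n;
    case PL: return !f.n;
    case VS: return f.v;
    case VC: return !f.v;
    case HI: return f.c && !f.z;
    case LS: return !f.c || f.z;
    case GE: return f.n == f.v;
    case LT: return f.n != f.v;
    case GT: return !f.z && f.n == f.v;
    case LE: return f.z || f.n != f.v;
    }
    assert(!"unknown condition code");
    return false;
}
```

After `CMP x0, x1` with x0 = 1 and x1 = 2 the flags are N = 1, Z = 0, C = 0, V = 0, so LO, LT, LE and NE hold while HI and GE do not.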

| Sub types (suffix of some instructions) | | |
|------|----------------------------|---------|
| B/SB | byte/signed byte           | 8 bits  |
| H/SH | half word/signed half word | 16 bits |
| W/SW | word/signed word           | 32 bits |
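
The signed and unsigned suffixes differ only in how the loaded value is widened to fill the register. A rough C model is sketched below. The function names mirror the mnemonics but are otherwise illustrative. Memory is modelled as a plain byte array, and two's-complement narrowing conversions are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* LDRB wt, [addr]: load one byte, zero-extend into the register. */
uint32_t ldrb(const uint8_t *addr) {
    assert(addr != NULL);
    return *addr;
}

/* LDRSB xt, [addr]: load one byte, sign-extend to 64 bits. */
int64_t ldrsb(const uint8_t *addr) {
    assert(addr != NULL);
    return (int8_t)*addr;
}

/* LDRH wt, [addr]: load a half word, zero-extend. */
uint32_t ldrh(const uint8_t *addr) {
    uint16_t h;
    memcpy(&h, addr, sizeof h);   /* host byte order, alignment-safe */
    return h;
}

/* LDRSW xt, [addr]: load a word, sign-extend to 64 bits. */
int64_t ldrsw(const uint8_t *addr) {
    int32_t w;
    memcpy(&w, addr, sizeof w);
    return w;
}

/* STRB wt, [addr]: store only wt<7:0>. */
void strb(uint8_t *addr, uint32_t wt) {
    assert(addr != NULL);
    *addr = (uint8_t)(wt & 0xFF);
}
```

A byte holding `0x80` therefore loads as 128 through `ldrb` and as -128 through `ldrsb`.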

| Flag | Set to 1 when:                                                        |
|------|-----------------------------------------------------------------------|
| N    | the result of the last operation was negative, cleared to 0 otherwise |
| Z    | the result of the last operation was zero, cleared to 0 otherwise     |
| C    | the last operation resulted in a carry, cleared to 0 otherwise        |
| V    | the last operation caused overflow, cleared to 0 otherwise            |

| Size (bits) | Assembly    | C         |
|-------------|-------------|-----------|
| 8           | byte        | char      |
| 16          | half word   | short int |
| 32          | word        | int       |
| 64          | double word | long int  |
| 128         | quad word   | -         |
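
The C column holds for the LP64 data model used by Linux and macOS on AArch64. Under LLP64 (Windows) `long int` is 32 bits, so code that must match a register width exactly is usually safer with the `<stdint.h>` types. The sketch below checks both views. `width_in_bits` and `plain_types_match_table` are hypothetical helper names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bits of an object that occupies `bytes` bytes. */
unsigned width_in_bits(size_t bytes) {
    assert(bytes <= 16);          /* nothing in the table exceeds a quad word */
    return (unsigned)(bytes * CHAR_BIT);
}

/* Fixed-width types give the table's sizes on every data model. */
_Static_assert(sizeof(int8_t)  == 1, "byte");
_Static_assert(sizeof(int16_t) == 2, "half word");
_Static_assert(sizeof(int32_t) == 4, "word");
_Static_assert(sizeof(int64_t) == 8, "double word");

/* Returns 1 when the plain C types match the table (true under LP64). */
int plain_types_match_table(void) {
    return width_in_bits(sizeof(char))  == 8
        && width_in_bits(sizeof(short)) == 16
        && width_in_bits(sizeof(int))   == 32
        && width_in_bits(sizeof(long))  == 64;
}
```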

| Addressing modes (base: register; offset: register or immediate) | | |
|------------------|--------------------------------------------|----------------|
| [base]           | MEM[base]                                  |                |
| [base, offset]   | MEM[base+offset]                           | (offset)       |
| [base, offset]!  | MEM[base+offset] then base = base + offset | (pre-indexed)  |
| [base], offset   | MEM[base] then base = base + offset        | (post-indexed) |
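
The three offset forms differ in which address is accessed and in whether the base register is written back. The C sketch below models a 64-bit LDR for each form. Memory is a byte array, `base` stands in for the base register, and all function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value at byte address `addr` of the modelled memory. */
static uint64_t load64(const uint8_t *mem, uint64_t addr) {
    uint64_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* LDR xt, [base, #offset]  : access base + offset, base unchanged. */
uint64_t ldr_offset(const uint8_t *mem, uint64_t base, int64_t offset) {
    return load64(mem, base + (uint64_t)offset);
}

/* LDR xt, [base, #offset]! : pre-indexed, base is updated before the access. */
uint64_t ldr_pre(const uint8_t *mem, uint64_t *base, int64_t offset) {
    assert(base != NULL);
    *base += (uint64_t)offset;
    return load64(mem, *base);
}

/* LDR xt, [base], #offset  : post-indexed, access first, then update base. */
uint64_t ldr_post(const uint8_t *mem, uint64_t *base, int64_t offset) {
    assert(base != NULL);
    uint64_t v = load64(mem, *base);
    *base += (uint64_t)offset;
    return v;
}
```

Post-indexing with an offset of 8 is the natural way to walk an array of 64-bit elements: each load returns the current element and leaves the base pointing at the next one.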

| Calling convention (register use)         |
|-------------------------------------------|
| Params: X0..X7; Result: X0                  |
| Reserved: X8, X16..X18 (do not use these)   |
| Unprotected: X9..X15 (callee may corrupt)   |
| Protected: X19..X28 (callee must preserve)  |
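
As a small illustration of the convention, consider a leaf function with three integer parameters. The C below is a hypothetical example. The comments describe how the arguments would map onto registers under the rules above, not the output of any particular compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Arguments arrive as a -> x0, b -> x1, c -> x2; the result leaves in x0.
   A hand-written body could be as short as:
       MADD x0, x0, x1, x2     // x0 = c + (a x b)
       RET
   No protected register (X19..X28) is touched, so nothing has to be saved. */
uint64_t mul_add(uint64_t a, uint64_t b, uint64_t c) {
    return c + a * b;
}

/* A caller that still needs `keep` after the call cannot leave it in
   X9..X15, which the callee may corrupt; it would hold it in a protected
   register or on the stack instead. */
uint64_t caller(uint64_t keep) {
    uint64_t r = mul_add(2, 3, 4);     /* 2 -> x0, 3 -> x1, 4 -> x2 */
    assert(r == 10);
    return keep + r;
}
```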

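| Op2 processing (applied to Op2 before anything else) | |
|------------------------|-----------------------------------------------------------------------------|
| LSL / LSR / ASR #imm   | Logical shift left / logical shift right / arithmetic shift right by #imm   |
| SXTW / SXTB {#imm2}    | Sign extension of a word / byte, optionally followed by LSL #imm2           |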
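
Because the second operand is shifted or extended before the main operation, address arithmetic such as base + index x 8 fits in a single instruction. The sketch below models three such forms of ADD in C. The function names are illustrative, and `#imm2` is taken to be a 2-bit value (0..3) as in the table.

```c
#include <assert.h>
#include <stdint.h>

/* ADD xd, xn, xm, LSL #imm : xd = xn + (xm << imm) */
uint64_t add_lsl(uint64_t n, uint64_t m, unsigned imm) {
    assert(imm < 64);
    return n + (m << imm);
}

/* ADD xd, xn, xm, ASR #imm : xd = xn + (xm >> imm), sign bit replicated */
uint64_t add_asr(uint64_t n, uint64_t m, unsigned imm) {
    assert(imm < 64);
    uint64_t shifted = m >> imm;
    if ((m >> 63) != 0 && imm > 0)
        shifted |= ~UINT64_C(0) << (64 - imm);   /* fill vacated bits with 1s */
    return n + shifted;
}

/* ADD xd, xn, wm, SXTW #imm2 : xd = xn + (sign_extend(wm) << imm2) */
uint64_t add_sxtw(uint64_t n, uint32_t wm, unsigned imm2) {
    assert(imm2 < 4);
    uint64_t ext = (uint64_t)(int64_t)(int32_t)wm;   /* 32 -> 64 bit sign extension */
    return n + (ext << imm2);
}
```

For instance, `add_lsl(base, i, 3)` is the address of element `i` in an array of 64-bit values, and `add_sxtw(base, i, 2)` does the same for 32-bit elements when `i` is a signed 32-bit index.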

## Aarch64 floating point instructions

### General concepts and conventions

Registers: Di (double precision: 64-bit, c:double), Si (single precision: 32-bit, c:float), Hi (half precision: 16-bit, c:non standard), i:0..31

Call convention: R0..R7 - arguments, R0 - result; R={D,S,H}; R8..R15 should be preserved by callee

Containers: r = {D,S,H}; #immn = n-bit constant

| Instruct                              | tion                                                                           | Mnemonic                             | Syntax                                                                                   | Explanation                                                                                               | Fla      |  |
|---------------------------------------|--------------------------------------------------------------------------------|--------------------------------------|------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|----------|--|
| Addition                              | n                                                                              | FADD                                 | FADD rd, rn, rm                                                                          | rd = rn + rm                                                                                              | Y        |  |
| Subtract                              | tion                                                                           | FSUB                                 | FSUB rd, rn, rm                                                                          | rd = rn - rm                                                                                              | Y        |  |
| Multiply                              | у                                                                              | FMUL                                 | FMUL rd, rn, rm                                                                          | rd = rn x rm                                                                                              | Y        |  |
| Multiply                              | y and neg                                                                      | FNMUL                                | FNMUL rd, rn, rm                                                                         | $Rd = - (rn \times rm)$                                                                                   | Y        |  |
| Multiply<br>Multiply                  | y and add                                                                      | FMADD                                | FMADD rd, rn, rm, ra                                                                     | rd = ra + (rn x rm)                                                                                       | Y        |  |
| Multiply                              | y and add neg                                                                  | FNMADD                               | FNMADD rd, rn, rm, ra                                                                    | Rd = - (ra + (rn x rm))                                                                                   | Y        |  |
| Multiply                              | y and sub                                                                      | FMSUB                                | FMSUB rd, rn, rm, ra                                                                     | rd = ra - (rn x rm)                                                                                       | ١        |  |
| Multiply                              | y and sub neg                                                                  | FNMSUB                               | FNMSUB rd, rn, rm, ra                                                                    | rd = (rn x rm) - ra                                                                                       | ١        |  |
| Divide                                |                                                                                | FDIV                                 | FDIV rd, rn, rm                                                                          | rd = rn / rm                                                                                              | 1        |  |
| Negation                              | n                                                                              | FNEG                                 | FNEG rd, rn                                                                              | rd = - rn                                                                                                 | 1        |  |
| Absolute                              | e value                                                                        | FABS                                 | FABS rd, rn                                                                              | rd =  rn                                                                                                  | ١        |  |
| Absolute<br>Maximum<br>Minimum        |                                                                                | FMAX                                 | FMAX rd, rn, rm                                                                          | rd = max(rn,rm)                                                                                           | Ι,       |  |
| Minimum                               |                                                                                | FMIN                                 | FMIN rd, rn, rm                                                                          | rd = min(rn,rm)                                                                                           | ,        |  |
| Square r                              | root                                                                           | FSQRT                                | FSQRT rd, rn                                                                             | rd = sqrt(rn)                                                                                             | ١,       |  |
| Round to                              | o integer                                                                      | FRINTI                               | FRINTI rd, rn                                                                            | Rd = round(rn)                                                                                            | ,        |  |
|                                       | Note: r={D,S,H} but operands and result must be of same type                   |                                      |                                                                                          |                                                                                                           |          |  |
|                                       | registers of equal size                                                        | FMOV                                 | FMOV rd, rn                                                                              | rd = rn                                                                                                   | $\vdash$ |  |
| , immed                               | diate FP constant                                                              | FMOV                                 | FMOV rd, #immf                                                                           | rd = immf (can be in scientific notation, ex: 1.2; 5.0e-3)                                                |          |  |
| i                                     | registers and memory                                                           | LDR/STR                              | LDR/STR rt, [addr]                                                                       | rt = Mem[addr]; Mem[addr] = rt (scaled address)                                                           | _        |  |
| unsca                                 | aled address offset                                                            | •                                    | LDR/STR rt, [addr]                                                                       | rt = Mem[addr]; Mem[addr] = rt (unscaled address)                                                         |          |  |
| <u> </u>                              | ore pair of registers                                                          | LDP/STP                              | LDP rt, rm, [addr]                                                                       | Load/store rt and rm from/to consecutive addresses                                                        | L        |  |
|                                       |                                                                                | FCSEL                                | FCSEL rd, rn, rm, cc                                                                     | If (cc) rd = rn else rd = rm                                                                              |          |  |
| Notes                                 |                                                                                |                                      | on may lead to rounding or NaN<br>; use instead SCVTF or UCVTF t                         | o convert an integer zero to FP                                                                           |          |  |
|  | Compare | FCMP | FCMP rn, rm | NZCV = compare(rn,rm) | Yes |  |
|  | Compare with zero | FCMP | FCMP rn, #0.0 | NZCV = compare(rn,0) | Yes |  |
|  | Conditional compare | FCCMP | FCCMP rn, rm, #imm4, cc | If (cc) NZCV = compare(rn,rm) else NZCV = #imm4 | Yes |  |
|  | Note: comparison of FP numbers can lead to wrong conclusions on very similar operands due to rounding errors |  |  |  |  |  |
|  | Between FP registers | FCVT | FCVT rd, rn | rd = rn (r={D,S,H}) |  |  |
|  | Signed integer to FP | SCVTF | SCVTF rd, rn | rd = rn |  |  |
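
The FCMP and FCCMP rows above can be illustrated with a small Python model. This is only a sketch of the flag semantics, not assembly: the function names `fcmp` and `fccmp` are hypothetical, and the condition `cc` is assumed to be already evaluated to a boolean. NZCV is packed as a 4-bit integer with N as the most significant bit.

```python
import math

def fcmp(a: float, b: float) -> int:
    """Model FCMP: return NZCV as a 4-bit integer (N is bit 3, V is bit 0)."""
    if math.isnan(a) or math.isnan(b):
        return 0b0011  # unordered: C and V set
    if a == b:
        return 0b0110  # equal: Z and C set
    if a < b:
        return 0b1000  # less than: N set
    return 0b0010      # greater than: C set

def fccmp(a: float, b: float, imm4: int, cond: bool) -> int:
    """Model FCCMP: compare if the condition holds, else take NZCV from imm4."""
    return fcmp(a, b) if cond else imm4 & 0xF

# Rounding makes mathematically equal values compare as different:
# 0.1 + 0.2 is slightly greater than 0.3 in binary floating point.
print(format(fcmp(0.1 + 0.2, 0.3), "04b"))  # prints 0010 (greater than)
```

The last line shows the rounding caveat from the note: a comparison that looks like it should be "equal" reports "greater than".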

## Aarch64 Advanced SIMD instructions (NEON) - work in progress

#### General concepts and conventions

Vector Registers: Vi (128-bit – quadword), i:0..31; each register can be divided into lanes of {8,16,32,64} bits ({B,H,S,D})

Syntax for register structure: Vi.nk, i=register number, k=lane type {B,H,S,D}, n=number of lanes; nk={8B,16B,4H,8H,2S,4S,1D,2D}

Syntax for register element: Vi.k[n], i=register number, k=lane type {B,H,S,D}, n=element number

Examples: V3.4S = V3 structured in 4 lanes of 32 bits; V5.B[0] = rightmost byte of V5 (least significant byte)
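
The lane notation can be illustrated with a small Python model that treats a vector register as a 128-bit integer. This is a sketch only: the helper `lanes` and the register contents are hypothetical, chosen for illustration.

```python
def lanes(value: int, lane_bits: int, count: int) -> list:
    """Split the low lane_bits*count bits of a register value into lanes, lane 0 first."""
    mask = (1 << lane_bits) - 1
    return [(value >> (i * lane_bits)) & mask for i in range(count)]

v3 = 0x00000004_00000003_00000002_00000001  # hypothetical contents of V3
v5 = 0x0123456789ABCDEF_FEDCBA9876543210    # hypothetical contents of V5

print(lanes(v3, 32, 4))          # V3.4S   -> [1, 2, 3, 4]
print(hex(lanes(v5, 8, 16)[0]))  # V5.B[0] -> 0x10
```

Lane 0 is always the least significant part of the register, so `V5.B[0]` is the lowest byte of the 128-bit value.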

Scalar Registers (Scl): Qi(128-bit), Di(64-bit), Si(32-bit), Hi(16-bit), Bi(8-bit); Register space shared with FP registers (e.g. V0.D[0]=D0)
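
The register sharing can be sketched in Python: each scalar name is modelled as a view of the low bits of the same 128-bit register. The helper `scalar_view` and the register contents are hypothetical.

```python
SCALAR_BITS = {"Q": 128, "D": 64, "S": 32, "H": 16, "B": 8}

def scalar_view(v: int, name: str) -> int:
    """Return the low bits of a 128-bit register value as seen through Qi/Di/Si/Hi/Bi."""
    return v & ((1 << SCALAR_BITS[name]) - 1)

v0 = 0x0123456789ABCDEF_FEDCBA9876543210  # hypothetical contents of V0

print(hex(scalar_view(v0, "D")))  # D0 == V0.D[0] -> 0xfedcba9876543210
print(hex(scalar_view(v0, "S")))  # S0 == V0.S[0] -> 0x76543210
print(hex(scalar_view(v0, "B")))  # B0 == V0.B[0] -> 0x10
```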

Instructions: not necessarily new mnemonics, but new syntax and behaviour: they can operate on vectors, on scalars and, in some cases, on vectors with scalars

Examples: ADD W0,W1,W2 (signed integer addition); ADD V0.4S,V1.4S,V2.4S (signed 4-component integer vector addition)
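
The difference between the two ADD forms can be sketched in Python. This is a model of the semantics under simple assumptions: the names `add_w` and `add_4s` are hypothetical, and a vector is represented as a list of four 32-bit lane values, lane 0 first.

```python
def add_w(a: int, b: int) -> int:
    """Model ADD Wd, Wn, Wm: 32-bit addition with wraparound."""
    return (a + b) & 0xFFFFFFFF

def add_4s(v1: list, v2: list) -> list:
    """Model ADD Vd.4S, Vn.4S, Vm.4S: four independent 32-bit additions."""
    assert len(v1) == len(v2) == 4
    return [add_w(x, y) for x, y in zip(v1, v2)]

print(add_w(7, 5))                                     # 12
print(add_4s([1, 2, 3, 0xFFFFFFFF], [10, 20, 30, 1]))  # [11, 22, 33, 0]
```

The last lane wraps to 0 without affecting its neighbours: no carry propagates between lanes.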

The following tables contain only new instructions. Most of the classic instructions are still valid but adopt the new syntax and behaviour

|             | Instruction                                                                                                                           | Mnemonic    | Syntax                          | Explanation                                                |                  | Flag |  |
|-------------|---------------------------------------------------------------------------------------------------------------------------------------|-------------|---------------------------------|------------------------------------------------------------|------------------|------|--|
|             | Duplicate vector element | DUP | DUP Vd.nk, Vs.k[m] | Replicate single Vs element to all elements of Vd |  |  |  |
|             | scalar element | DUP | DUP Vd.nk, Scl | Replicate scalar Scl to all elements of Vd (S=lsbits of {X,W}) |  |  |  |
|             | Insert vector element | INS | INS Vd.k[i], Vs.r[j] | Copy element r[j] of Vs to element k[i] of Vd |  |  |  |
|             | scalar element | INS | INS Vd.k[i], Scl | Copy scalar Scl to element k[i] of Vd (S=lsbits of {X,W}) |  |  |  |
|             | Note: 64-bit scalar can only be used with 64-bit lanes, 32-bit scalar can be used with 32/16/8-bit lanes |  |  |  |  |  |  |
|             | Signed move to scalar register | SMOV | SMOV Rd, Vn.T[i] | Copy vector element to register, sign extended (dim R > dim T) |  |  |  |
|             | Unsigned | UMOV | UMOV Rd, Vn.T[i] | Copy vector element to register, unsigned (dim R >= dim T) |  |  |  |
|             | Signed long (add as example) | SADDL | SADDL Vd.nk, Vs.nj, Vr.np | k has double the size of j/p (ex: SADDL V0.2D,V1.2S,V2.2S) |  |  |  |
|             | for higher lanes | SADDL2 | SADDL2 Vd.nk, Vs.nj, Vr.np | The same, but taking the most significant lanes of operands |  |  |  |
|             | for wide operands | SADDW | SADDW Vd.nk, Vs.nj, Vr.np | k and j have double the size of p |  |  |  |
|             | wide operands, higher lanes | SADDW2 | SADDW2 Vd.nk, Vs.nj, Vr.np | The same, but taking the most significant lanes of op2 |  |  |  |
|             | Narrow operands (sub as example) | SUBHN | SUBHN Vd.nk, Vs.nj, Vr.np | j and p have double the size of k; keeps the most significant half of each difference (ex: SUBHN V0.2S,V1.2D,V2.2D) |  |  |  |
|             | Note: variants can have different suffixes {L,W,N,P} (long, wide, narrow, pairing) or prefixes {SQ,UQ} (signed/unsigned saturating) |  |  |  |  |  |  |
|             | Add across lanes | ADDV | ADDV Scl, Vs.nk | Add all elements of Vs into a scalar (ex: ADDV S0, V2.4S) |  |  |  |
|             | Signed long add across lanes | SADDLV | SADDLV Scl, Vs.nk | The same but dim(Scl) larger than k (ex: SADDLV D0, V2.4S) |  |  |  |
|             | Signed maximum across lanes | SMAXV | SMAXV Scl, Vs.nk | Maximum goes to scalar Scl |  |  |  |
|             | minimum | SMINV | SMINV Scl, Vs.nk | Minimum goes to scalar Scl |  |  |  |
|             | Note: prefix {U,S,F} defines data type (ex: FMINV finds the minimum element of an FP vector) |  |  |  |  |  |  |
| Compare     | Compare bitwise vector | CMcc | CMcc Vd.nk, Vn.nj, Vm.np | if true Vd.k[i]=-1 (all ones) else Vd.k[i]=0 | cc={EQ,GE,GT,HI,HS} |  |  |
|             | with zero | CMcc | CMcc Vd.nk, Vn.nj, #0 | Compare bitwise vector with zero | cc={EQ,GE,GT,LE,LT} |  |  |